Docs: Improve SFT/RL user experience #2794
base: main
Conversation
Force-pushed: 8629e8b -> 5ae647f, c836743 -> 66c8566, d971a0b -> 9b41868, b87dc05 -> 4f3a8cd
## Create virtual environment and Install MaxText dependencies
- If you have already completed the [MaxText installation](https://github.com/AI-Hypercomputer/maxtext/blob/main/docs/guides/install_maxtext.md), you can skip to the next section for post-training dependencies installations. Otherwise, please install `MaxText` using the following commands before proceeding.
+ If you have already completed the [MaxText installation](../../install_maxtext.md), you can skip to the next section for post-training dependencies installations. Otherwise, please install `MaxText` using the following commands before proceeding.
Why do we need to change the link here?
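For readers following the thread: both link targets describe the same install flow. Below is a minimal sketch of that flow, assuming a fresh clone, a Python venv, and the repo's `setup.sh`; treat `install_maxtext.md` as the source of truth for the exact commands.

```bash
# Hedged sketch only -- install_maxtext.md has the authoritative steps.
# The venv location and the setup-script invocation are assumptions.
git clone https://github.com/AI-Hypercomputer/maxtext.git
cd maxtext
python3 -m venv ~/venv-maxtext
source ~/venv-maxtext/bin/activate
bash setup.sh   # installs MaxText dependencies into the active environment
```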
export RUN_NAME=<name for this run> # e.g., $(date +%Y-%m-%d-%H-%M-%S)
- export MAXTEXT_CKPT_PATH=${BASE_OUTPUT_DIRECTORY}/${RUN_NAME}/0/items
+ export MAXTEXT_CKPT_PATH=${BASE_OUTPUT_DIRECTORY}/${RUN_NAME}/0/items # Actual checkpoint saved with an extra /0/items path suffix
This doesn't look right if the user has the checkpoint in GCS. We can remove this env variable from here and move it to the next section, similar to https://maxtext.readthedocs.io/en/latest/tutorials/posttraining/sft.html#get-your-model-checkpoint.
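To make the suggestion concrete, here is a hedged sketch of setting the variable once the checkpoint location is known, whether local or in GCS; the bucket and directory names are placeholders rather than values from this PR, and the `/0/items` suffix comes from the diff above.

```bash
# Placeholder values -- substitute your own output location and run name.
export BASE_OUTPUT_DIRECTORY=gs://<your-bucket>/<output-dir>
export RUN_NAME=<name for this run>   # e.g., $(date +%Y-%m-%d-%H-%M-%S)
# Converted checkpoints are written with an extra /0/items path suffix:
export MAXTEXT_CKPT_PATH=${BASE_OUTPUT_DIRECTORY}/${RUN_NAME}/0/items
```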
The overview of what this run will do is as follows:
- 1. We load a policy model and a reference model. Both are copies of `Llama3.1-8b-Instruct`.
+ 1. We load a policy model and a reference model. Both are copies of the model checkpoint you specified (e.g., `Llama3.1-8b-Instruct`).
Can you do the same at line 128?
## 2. Install XPK
- Install XPK by following the instructions in the [official documentation](https://github.com/AI-Hypercomputer/xpk?tab=readme-ov-file#installation-via-pip).
+ Install XPK by following the instructions in the [official documentation](https://github.com/AI-Hypercomputer/xpk?tab=readme-ov-file#installation-via-pip). We also provide a quick guide for XPK installation and usage [here](https://maxtext.readthedocs.io/en/latest/run_maxtext/run_maxtext_via_xpk.html).
nit: We also provide a quick guide for XPK installation here
That XPK documentation mainly talks about pre-training, so pointing users to it at this point might create some confusion. Can we explicitly say to follow just the XPK installation & prerequisite instructions in that guide, and then continue with the current doc for post-training?
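For context, a hedged sketch of the pip-based install the linked README section describes; the PyPI package name and the venv layout are assumptions here, so defer to the official instructions if they differ.

```bash
# Assumption: XPK installs from PyPI as `xpk`, per the linked
# "installation via pip" section of its README.
python3 -m venv ~/venv-xpk
source ~/venv-xpk/bin/activate
pip install xpk
xpk --help   # quick sanity check that the CLI is available
```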
## Submit your RL workload via Pathways
- Please create a pathways ready GKE cluster as described [here](https://docs.cloud.google.com/ai-hypercomputer/docs/workloads/pathways-on-cloud/create-gke-cluster), and you can submit the `train_rl.py` script via [XPK](https://github.com/AI-Hypercomputer/xpk).
+ Please create a pathways ready GKE cluster as described [here](https://docs.cloud.google.com/ai-hypercomputer/docs/workloads/pathways-on-cloud/create-gke-cluster), and you can submit the `train_rl.py` script via [XPK](https://github.com/AI-Hypercomputer/xpk). We also provide a quick guide for XPK installation and usage [here](../../run_maxtext/run_maxtext_via_xpk.md).
Similar comment
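As a rough illustration of the submission step both versions of this line describe, a hedged XPK invocation; the subcommand and flags follow common xpk usage, and every value (cluster, TPU type, image, script path) is a placeholder to verify against the xpk and Pathways-on-Cloud docs.

```bash
# All values are placeholders; check the exact flags against the xpk documentation.
xpk workload create-pathways \
  --cluster=<your-pathways-ready-gke-cluster> \
  --workload=<workload-name> \
  --tpu-type=<tpu type, e.g. v5p-8> \
  --num-slices=1 \
  --docker-image=<your-maxtext-image> \
  --command="python3 <path to train_rl.py> <config and overrides>"
```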
Description
Reduce user friction in SFT/RL and fix broken links.
b/463394566
b/463409639
b/463409807
b/463396352
b/463393644
Tests
N/A
Checklist
Before submitting this PR, please make sure (put X in square brackets):
`gemini-review` label.